52 research outputs found

    An extensible web interface for databases and its application to storing biochemical data

    This paper presents a generic web-based database interface implemented in Prolog. We discuss the advantages of the implementation platform and demonstrate the system's applicability in providing access to integrated biochemical data. Our system exploits two libraries of SWI-Prolog to create a schema-transparent interface within a relational setting. As is expected in declarative programming, the interface was written with minimal programming effort due to the high level of the language and its suitability to the task. We highlight two of Prolog's features that are well suited to the task at hand: term representation of structured documents and the relational nature of Prolog, which facilitates transparent integration of relational databases. Although we developed the system for accessing in-house biochemical and genomic data, the interface is generic and provides a number of extensible features. We describe some of these features with references to our research databases. Finally, we outline an in-house library that facilitates interaction between Prolog and the R statistical package. We describe how it has been employed in the present context to store output from statistical analysis in the database. Comment: Online proceedings of the Joint Workshop on Implementation of Constraint Logic Programming Systems and Logic-based Methods in Programming Environments (CICLOPS-WLPE 2010), Edinburgh, Scotland, U.K., July 15, 2010

    Exploiting independence for branch operations in Bayesian learning of C&RTs

    In this paper we extend a methodology for Bayesian learning via MCMC with the ability to grow arbitrarily long branches in C&RT models. We are able to do so by exploiting independence in the model construction process. The ability to grow branches rather than single nodes has been noted as desirable in the literature. The most singular feature of the underlying methodology used here, in comparison to other approaches, is the coupling of the prior and the proposal. The main contribution of this paper is to show how taking advantage of independence in the coupled process can allow branch growing and swapping for proposal models.

    Advances in Big Data Bio Analytics

    Delivering effective data analytics is of crucial importance to the interpretation of the multitude of biological datasets currently generated by an ever-increasing number of high-throughput techniques. Logic programming has much to offer in this area. Here, we detail advances that highlight two of the strengths of logical formalisms in developing data analytic solutions in biological settings: access to large relational databases and building analytical pipelines that collect graph information from multiple sources. We present significant advances on the bio_db package, which serves biological databases as Prolog facts that can be accessed either by in-memory loading or via database backends. These advances include modularising the underlying architecture and the incorporation of datasets from a second organism (mouse). In addition, we introduce a number of data analytics tools that operate on these datasets and are bundled in the analysis package bio_analytics. Emphasis in both packages is on ease of installation and use. We highlight the general architecture of our component-based approach. An experimental graphical user interface via SWISH for local installation is also available. Finally, we advocate that biological data analytics is a fertile area which can drive further innovation in applied logic programming.

    A primer on correlation-based dimension reduction methods for multi-omics analysis

    The continuing advances of omic technologies mean that it is now more feasible to measure the numerous features collectively reflecting the molecular properties of a sample. When multiple omic methods are used, statistical and computational approaches can exploit these large, connected profiles. Multi-omics is the integration of different omic data sources from the same biological sample. In this review, we focus on correlation-based dimension reduction approaches for single omic datasets, followed by methods for pairs of omic datasets, before detailing further techniques for three or more omic datasets. We also briefly detail network methods that can be used when three or more omic datasets are available and that complement correlation-oriented tools. To aid readers new to this area, these are all linked to relevant R packages that can implement these procedures. Finally, we discuss scenarios of experimental design and present road maps that simplify the selection of appropriate analysis methods. This review will help researchers navigate the emerging methods for multi-omics, integrate diverse omic datasets appropriately and embrace the opportunity of population multi-omics. Comment: 30 pages, 2 figures, 6 tables

    A logical approach to working with biological databases

    It has been argued before that Prolog is a strong candidate for research and code development in bioinformatics and computational biology. This position has been based on both the intrinsic strengths of Prolog and recent advances in its technologies. Here we strengthen the case for the deployment and penetration of Prolog into bioinformatics, by introducing biodb, a comprehensive and extensible system for working with biological data. We focus on databases that translate between biological products and product-to-product interactions, the latter of which can be visualised as graphs. This library allows easy access to high quality data in two formats: as Prolog fact files and as SQLite databases. On-demand downloading of prepacked data files in these two formats is supported in all operating system architectures as well as reconstruction from latest data files from the curated databases. The methods used to deliver the data are transparent to the user and the data are delivered in the familiar format of Prolog facts.

    ATG9A loss confers resistance to trastuzumab via c-Cbl mediated Her2 degradation

    Acquired or de novo resistance to trastuzumab remains a barrier to patient survival, and the underlying mechanisms remain unclear. Using stable isotope labelling by amino acids in cell culture (SILAC)-based quantitative proteomics to compare proteome profiles between trastuzumab sensitive/resistant cells, we identified autophagy related protein 9A (ATG9A) as a down-regulated protein in trastuzumab resistant cells (BT474-TR). Interestingly, ATG9A ectopic expression markedly decreased the proliferative ability of BT474-TR cells but not that of the parental line (BT474). This was accompanied by a reduction of Her2 protein levels and AKT phosphorylation (S473), as well as a decrease in Her2 stability, which was also observed in JIMT1 and MDA-453, naturally trastuzumab-resistant cells. In addition, ATG9A indirectly promoted c-Cbl recruitment to Her2 on T1112, a known c-Cbl docking site, leading to increased K63 Her2 polyubiquitination. Whereas silencing c-Cbl abrogated ATG9A repressive effects on Her2 and downstream PI3K/AKT signaling, its depletion restored BT474-TR proliferative rate. Taken together, our findings show for the first time that ATG9A loss in trastuzumab resistant cells allowed Her2 to escape from lysosomal targeted degradation through K63 poly-ubiquitination via c-Cbl. This study identifies ATG9A as a potentially druggable target to overcome resistance to anti-Her2 blockade.

    Proteomic profile of KSR1-regulated signalling in response to genotoxic agents in breast cancer

    Kinase suppressor of Ras 1 (KSR1) has been implicated in tumorigenesis in multiple cancers, including skin, pancreatic and lung carcinomas. However, our recent study revealed a role of KSR1 as a tumour suppressor in breast cancer, the expression of which is potentially correlated with chemotherapy response. Here, we aimed to further elucidate the KSR1-regulated signalling in response to genotoxic agents in breast cancer. Stable isotope labelling by amino acids in cell culture (SILAC) coupled to high-resolution mass spectrometry (MS) was implemented to globally characterise cellular protein levels induced by KSR1 in the presence of doxorubicin or etoposide. The acquired proteomic signature was compared and GO-STRING analysis was subsequently performed to illustrate the activated functional signalling networks. Furthermore, the clinical associations of KSR1 with identified targets and their relevance in chemotherapy response were examined in breast cancer patients. We reveal a comprehensive repertoire of thousands of proteins identified in each dataset and compare the unique proteomic profiles as well as functional connections modulated by KSR1 after doxorubicin (Doxo-KSR1) or etoposide (Etop-KSR1) stimulus. From the up-regulated top hits, several proteins, including STAT1, ISG15 and TAP1 are also found to be positively associated with KSR1 expression in patient samples. Moreover, high KSR1 expression, as well as high abundance of these proteins, is correlated with better survival in breast cancer patients who underwent chemotherapy. In aggregate, our data exemplify a broad functional network conferred by KSR1 with genotoxic agents and highlight its implication in predicting chemotherapy response in breast cancer

    Biological and prognostic impact of apobec-induced mutations in the spectrum of plasma cell dyscrasias

    In multiple myeloma (MM), whole exome sequencing (WES) studies have revealed four mutational signatures: two associated with aberrant activities of APOBEC cytidine deaminases (Signatures #2 and #13) and two clock-like signatures associated with "cancer age" (Signatures #1 and #5). Mutational signatures have not been investigated systematically in larger series, nor in other primary plasma cell dyscrasias such as monoclonal gammopathy of unknown significance (MGUS) or primary plasma cell leukemia (pPCL). Finally, while APOBEC activity has been correlated to increased mutational burden and poor-prognosis MAF/MAFB translocations in MM at diagnosis, this has never been confirmed in multivariate analysis in an independent series. To answer these questions, we mined 1151 MM samples from public WES datasets, including samples from the IA9 public release of the CoMMpass trial. The CoMMpass data were generated as part of the Multiple Myeloma Research Foundation Personalized Medicine Initiatives. We also analyzed 6 MGUS/Smoldering MM as well as 5 previously published pPCLs. Extraction of mutational signatures was performed using the NNMF algorithm as previously described (Alexandrov et al. Nature 2013). NNMF in the whole cohort extracted the known 4 signatures pertaining to distinct mutational processes: the two clock-like processes (signatures #1 and #5) and aberrant APOBEC deaminase activity (signatures #2 and #13). While the clock-like processes were more prominent in the cohort as a whole (median 70%, range 0-100%), the APOBEC signatures showed a heterogeneous contribution, more visible in samples with the highest mutation burden. In fact, the absolute and relative contribution of APOBEC activity to the mutational repertoire correlated with the overall number of mutations (r=0.71, p<0.0001). As previously described, APOBEC contribution was significantly enriched among MM patients with t(14;16) and with t(14;20) (p<0.001), but the association between relative APOBEC contribution and mutational load remained significant across all cytogenetic subgroups with the exception of t(11;14). In the MGUS/SMM series, APOBEC contribution was generally low. Conversely, APOBEC activity was preponderant in three out of five pPCL samples, all of them characterized by the t(14;16)(IGH/MAF); in the remaining two pPCL the absolute number of APOBEC mutations was similar to MM. Overall, the APOBEC contribution was characterized by a progressive increment from MGUS/SMM to MM and pPCL. We next went on to investigate the prognostic impact of APOBEC signatures at diagnosis. Patients with APOBEC contribution in the 4th quartile had shorter PFS (2-y PFS 47% vs 66%, p<0.0001) and OS (2-y OS 70% vs 85%, p=0.0033) than patients in quartiles 1-3 (Figure 1a-b). This was independent from the association of APOBEC activity with MAF translocations and higher mutational burden, as shown by multivariate analysis with Cox regression (Figure 1c-d). ISS stage III was the only other variable that retained its independent prognostic value for both PFS and OS. We therefore combined both variables and found that co-occurrence of ISS III and APOBEC 4th quartile identifies a fraction of high-risk patients with 2-y OS of 53.8% (95% CI 36.6%-79%), while their simultaneous absence identifies long term survivors with 2-y OS of 93.3% (95% CI 89.6-97.2%). In this study, we provided a global overview on the contribution of mutational processes in the largest whole exome series of plasma cell dyscrasias investigated to date by NNMF. We propose that cases with high APOBEC activity may represent a novel prognostic subgroup that is transversal to conventional cytogenetic subgroups, advocating for closer integration of next-generation sequencing studies and clinical annotation to confirm this finding in independent series.
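
The abstract above relies on non-negative matrix factorisation (NNMF) to split a matrix of mutation counts into signatures and per-sample exposures. As a rough illustration of that idea only, the sketch below applies the generic Lee and Seung multiplicative updates to a tiny matrix. It is not the authors' pipeline or the framework of Alexandrov et al.; the function names, the toy counts (4 hypothetical mutation classes by 5 hypothetical samples) and all parameter values are assumptions made for this example.

```python
import random

# Hypothetical toy data: rows are mutation classes, columns are samples.
# Built from two made-up signatures, so a rank-2 factorisation fits well.
TOY_COUNTS = [
    [8, 16, 0, 24, 8],
    [5, 2, 8, 5, 3],
    [7, 2, 12, 6, 4],
    [10, 0, 20, 5, 5],
]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of the residual V - WH."""
    wh = matmul(w, h)
    return sum(
        (v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0]))
    ) ** 0.5


def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Factorise non-negative V (m x n) as W (m x k) times H (k x n).

    W plays the role of the signatures and H the per-sample exposures.
    Uses multiplicative updates, which keep every entry non-negative.
    """
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(iterations):
        # Update exposures: H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(n)]
             for a in range(k)]
        # Update signatures: W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(m)]
    return w, h


if __name__ == "__main__":
    signatures, exposures = nmf(TOY_COUNTS, k=2)
    print("residual:", frobenius_error(TOY_COUNTS, signatures, exposures))
```

In a real analysis the rows would be the 96 trinucleotide mutation classes and the number of signatures `k` would be chosen by a model-selection step, neither of which this sketch attempts.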